Abstract

This is the final report on the work done under the Flexible Memory Systems AASERT fellowship
(Grant #F49620-94-1-0462). During the funding period, progress was made in two major areas:

• M-Machine Hardware: The memory system architecture of the M-Machine was finalized. RTL
(Verilog) implementation of the memory system was continued. (RTL implementation began before
the start of the funding period.) At the end of the funding period, RTL models of the M-Machine's
cache and external memory interface had been developed. The external memory interface had been
integrated with a model for the M-Machine's off-chip memory, and integration of the cache banks
and external memory interface had begun.

• Runtime Software: System software for the M-Machine was developed which uses the novel features
of the M-Machine to implement shared memory in software with hardware assistance, demonstrating
that the M-Machine's memory mechanisms are sufficient for implementation of shared memory.


1  Introduction 

[illegible] of providing hardware to support one specific shared memory model, the [illegible]
provides hardware [illegible] which accelerate a number of functions which are common to many shared
memory [illegible] M-Machine's hardware to determine [illegible] references require software
intervention, in this model all [illegible] reduce the frequency with which software is invoked, and
the run-time of the handlers when they are invoked.

2  Progress  and  Accomplishments 

The AASERT award covered work on the memory system of the M-Machine as part of the M-Machine
[illegible]. During the period covered [illegible], progress was made in the areas of architectural
refinement, RTL Implementation, and Run-Time Software.


2.1  Architectural  Refinement 

During the period covered, a number of refinements were made to the architecture of the M-Machine's
memory system. Because the basic architecture of the M-Machine was fairly well-defined, the
architectural refinements that were made were minor refinements as opposed to major changes.
Refinements to the M-Machine architecture during this time period:

• The external memory system was made responsible for generating all event queue entries for events
raised by the memory system. If the cache banks detect an event, the request is forwarded to the
EMI for handling. Previously, the architecture had called for the cache banks to generate event
queue entries for events that they detect. This architecture change was made to simplify the cache
banks.

• The event queue format was changed to improve the performance of the runtime software by having
the hardware generate data that the runtime software was having to compute, and including both
the physical and virtual addresses of a memory event in the event queue entry to eliminate the need
for the event handlers to perform an address translation.

• The PUTCSTAT instruction was revised so that it atomically flushes a data block from the on-chip
cache, changes the block status of that block to the specified value, and returns the previous
block status of the block. This change was made because the runtime software needs the ability to
determine the status of a cache block immediately before the block is invalidated in the cache, and
providing an atomic instruction to do this was the only way to provide this function.
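
The refinements listed above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the M-Machine's RTL or instruction encoding: the status names, the field names, the event label, and the MemorySystemModel class are all hypothetical, and a lock stands in for the atomicity that the hardware provides.

```python
from dataclasses import dataclass, field
from enum import Enum
from threading import Lock


class BlockStatus(Enum):
    # Hypothetical status names; the report does not list the real encodings.
    INVALID = 0
    READ_ONLY = 1
    READ_WRITE = 2


@dataclass(frozen=True)
class EventQueueEntry:
    # Both addresses are recorded so a handler needs no translation step.
    virtual_addr: int
    physical_addr: int
    kind: str  # hypothetical event label


@dataclass
class MemorySystemModel:
    status: dict = field(default_factory=dict)      # block address -> BlockStatus
    cached: set = field(default_factory=set)        # blocks resident on chip
    event_queue: list = field(default_factory=list)
    _lock: Lock = field(default_factory=Lock, repr=False)

    def putcstat(self, block, new_status):
        """Flush the block, set its status, and return the old status as one step."""
        with self._lock:
            old = self.status.get(block, BlockStatus.INVALID)
            self.cached.discard(block)        # flush from the on-chip cache
            self.status[block] = new_status   # install the requested status
            return old                        # status seen just before the flush

    def raise_event(self, virtual_addr, physical_addr, kind):
        """Model the EMI as the single writer of event queue entries."""
        entry = EventQueueEntry(virtual_addr, physical_addr, kind)
        self.event_queue.append(entry)
        return entry
```

Because the old status is read and the new one installed inside a single critical section, a caller can learn what the block's status was at the moment of invalidation, which is the property the revised instruction is described as providing.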

2.2  RTL  Implementation 

The RTL implementation of the M-Machine continued throughout the funding period. During the funding
period, the RTL for the cache banks and the EMI was completed to a sufficient level that integration
testing of the entire memory system could begin. By the end of the funding period, the integrated
memory system could perform reads and writes to both the caches and the external memory, and could
detect and handle events by writing event queue entries into the event queue. The schematic design for
the cache datapath was begun during the funded period, and completed shortly after the funded period.


2.3  Run-time  Software 

During the funded period, work was begun on implementing the run-time system of the M-Machine,
including software handlers to allow the execution of shared-memory programs on the M-Machine. Two
remote memory handlers have been implemented on the M-Machine, one which implements a cached
shared-memory scheme, and one which implements an uncached shared-memory scheme. The
implementation of these memory handlers serves to drive the refinement of the memory system architecture
and to demonstrate that the M-Machine's memory system mechanisms can be used to implement shared
memory successfully.
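
The difference between the two schemes can be sketched in a few lines. The report does not describe the handlers' internals, so everything here is an assumption made for illustration: the HomeNode class, the block size, and the request counter are hypothetical, and the cached variant omits the invalidation traffic that a real coherent scheme would need.

```python
BLOCK_WORDS = 8  # hypothetical block size in words


class HomeNode:
    """Memory owned by a remote node; counts the requests it serves."""

    def __init__(self, words):
        self.words = list(words)
        self.requests = 0

    def read_word(self, addr):
        self.requests += 1
        return self.words[addr]

    def read_block(self, base):
        self.requests += 1
        return self.words[base:base + BLOCK_WORDS]


def uncached_read(home, addr):
    """Uncached scheme: every remote read goes to the home node."""
    return home.read_word(addr)


class CachedReader:
    """Cached scheme: fetch a whole block on a miss, then reuse the local copy."""

    def __init__(self, home):
        self.home = home
        self.local_blocks = {}  # block base address -> local copy of the block

    def read(self, addr):
        base = addr - addr % BLOCK_WORDS
        if base not in self.local_blocks:
            self.local_blocks[base] = self.home.read_block(base)
        return self.local_blocks[base][addr - base]
```

In this model, reading eight consecutive words costs eight requests to the home node with uncached_read and one request with CachedReader, which is the kind of trade-off that separates the two handlers.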


3  Publications 

Two major publications were made based on the work covered by this award. In 10/94, the paper
"Hardware Support for Fast Capability-Based Addressing," which describes the memory protection scheme of
the M-Machine, was published in the proceedings of the 6th International Conference on Architectural
Support for Programming Languages and Operating Systems (ASPLOS VI). In 11/95, the paper "The
M-Machine Multicomputer," by Marco Fillo et al., was published in the MICRO-28 conference. This
paper described the architecture of the M-Machine, including a description of the memory system and
preliminary results from the run-time system. Copies of both of these papers are included with this
report.
4  Conclusion

During the period covered by this award, a number of refinements were made to the memory system
architecture, significant progress was made on the implementation of the M-Machine hardware, and
run-time software was written to demonstrate the feasibility of the M-Machine's memory system as a
platform for implementing shared memory.



AUGMENTATION AWARDS FOR [illegible]

The Department of Defense (DOD) requires certain information to evaluate the
effectiveness of the AASERT program. By accepting this Grant Modification,
which bestows the AASERT funds, the Grantee agrees to provide the information
requested below to the Government's technical point of contact by each annual
anniversary of the AASERT award date.

1. Grantee identification data: (R & T [illegible] found on Page 1 of Grant)

a. University Name: Massachusetts Institute of Technology
b. Grant Number: F49620-94-1-0462
c. P.I. Name: William J. Dally
d. R & T Number: [illegible]
e. AASERT Reporting Period: From 3/1/95 To 2/29/96

2. Total funding of the Parent Agreement and the number of full-time
equivalent graduate students (FTEGS) supported by the Parent Agreement during
the 12-month period prior to the AASERT award date.

a. Funding: $ 877,494
b. Number FTEGS: [illegible]

3. Total funding of the Parent Agreement and the number of FTEGS supported
by the Parent Agreement during the current 12-month reporting period.

a. Funding: $ 94[illegible]553
b. Number FTEGS: [illegible]

4. Total AASERT funding and number of FTEGS and undergraduate students
(UGS) supported by AASERT during the current 12-month reporting period.

a. Funding: $ 78[illegible]753
b. Number FTEGS: [illegible]
c. Number UGS: [illegible]

VERIFICATION STATEMENT: I hereby verify that all students supported by the
AASERT award are U.S. citizens.


